ABSTRACT

While Bayesian model selection is a useful tool to discriminate between competing cosmological models, it only gives a relative rather than an absolute measure of how good a model is. Bayesian doubt introduces an unknown benchmark model against which the known models are compared, thereby obtaining an absolute measure of model performance in a Bayesian framework.

We apply this new methodology to the problem of the dark energy equation of state, comparing an absolute upper bound on the Bayesian evidence for a presently unknown dark energy model against a collection of known models including a flat ΛCDM scenario. We find a strong absolute upper bound to the Bayes factor B between the unknown model and ΛCDM, giving B < 3. The posterior probability for doubt is found to be less than 6% (with a 1% prior doubt) while the probability for ΛCDM rises from an initial 25% to just over 50% in light of the data. We conclude that ΛCDM remains a sufficient phenomenological description of currently available observations and that there is little statistical room for model improvement.



1 INTRODUCTION

One of the most important questions in cosmology is to identify the fundamental model underpinning the vast amount of observations nowadays available. The so-called "cosmological concordance model" is based on the cosmological principle (i.e., that the Universe is isotropic and homogeneous, at least on large enough scales) and on the hot Big Bang scenario, complemented by an inflationary epoch. This remarkably simple model is able to explain, with only half a dozen free parameters, observations spanning a huge range of time and length scales. Since both a cold dark matter (CDM) and a cosmological constant (Λ) component are required to fit the data, the concordance model is often referred to as "the ΛCDM model". It is however important to keep in mind that at this stage the ΛCDM model is not a model in the sense attributed to the word by particle physicists, but rather a phenomenological scenario that appears to be able to explain the vast majority of observations with a great economy of free parameters.

In the classical approach to statistics, models (or hypotheses) can never be proved true, only falsified. Popper (1953), for example, argued that theories always remain "infinitely improbable" regardless of the amount of evidence gathered in their favour. However, in the context of Bayesian inference, support can be accrued for a model if the observed data verify predictions made by the model but not by competing models (see Jaynes 2003). This is the subject of Bayesian model selection (see e.g. Trotta 2008; Trotta 2007 for applications to the cosmological context): given a set of competing models, the Bayes factor gives a measure of the relative performance of each model in explaining the data. This program naturally prefers models that provide a good fit with the fewest free parameters, thus implementing a quantitative version of Occam's razor.

Although Bayesian model selection can identify the 
best model in a given set of known models, it has no 
way of indicating whether the absolute quality of the 
preferred model is high or low. However, it seems desir- 
able to be able to gauge the absolute performance of a 
model in a Bayesian sense, rather than just its relative 
performance with respect to known competitors. In par- 
ticular, this seems crucial for deciding whether the set 
of known models includes the true model. 

The purpose of this paper is to build on the notion of Bayesian doubt introduced by Starkman et al. (2008) to develop and apply a Bayesian technique for model discovery, focusing in particular on the nature of dark energy. The structure of this paper is as follows: in section 2 we recall the notion of Bayesian doubt and introduce a new procedure for estimating an upper bound for the Bayes factor in favour of doubt. We next employ this procedure in section 3 to assess the state of our knowledge of the dark energy equation of state, focusing on the status of the current ΛCDM concordance model. We present our results in terms of the posterior probability for doubt and for ΛCDM in section 4 and discuss our conclusions in section 5.



2 BAYESIAN MODEL DISCOVERY 

In this section we review the concept of Bayesian doubt 
and explain how this can lead to model discovery. 



2.1 The notion of Bayesian doubt 

Bayesian doubt, as introduced by Starkman et al. (2008), is an extension of Bayesian model selection. It seeks to determine a scale quantifying the absolute quality of a model, as opposed to the relative performance of two models, given by their Bayes factor. The key idea of Bayesian doubt is that the general statistical characteristics of what would be recognised as a 'good' model are known, even if the specifics of the model are not.

We begin by introducing a hypothetical unknown model $X$ which has the characteristics of what would be considered a good model, to be defined below. This idealized good model then acts as a benchmark against which known models can be compared using standard Bayesian model selection. Following Starkman et al. (2008), we define 'doubt', $\mathcal{D}$, as the posterior probability of this unknown model:

$$ \mathcal{D} \equiv p(X|d) = \frac{p(d|X)\,p(X)}{p(d)} = \left[ 1 + \frac{\sum_{i=1}^{N} p(d|M_i)\,p(M_i)}{p(d|X)\,p(X)} \right]^{-1}, \tag{1} $$

where $\{M_i\}$ ($i = 1, \dots, N$) is the set of $N$ known models and $d$ are the data. In the above expression, $p(X)$ is the prior probability for the model $X$, in other words, the prior probability that our list of known models does not contain the true model. $p(M_i)$ is the prior probability of model $M_i$ and $p(d|M_i)$ is the Bayesian evidence for model $M_i$, given by

$$ p(d|M_i) = \int d\theta_i \, p(d|\theta_i, M_i)\, p(\theta_i|M_i), \tag{2} $$

where $\theta_i$ are the parameters of model $M_i$, $p(d|\theta_i, M_i)$ is the likelihood function for model $M_i$, and $p(\theta_i|M_i)$ is the prior probability of the parameters of model $M_i$.

Once the level of prior doubt has been chosen by fixing the value of $p(X)$, we assume for simplicity, based on a principle of indifference, that the prior probabilities for the $N$ known models are all equal, i.e.

$$ p(M_i) = \frac{1}{N}\left(1 - p(X)\right). \tag{3} $$

We single out the ΛCDM model as one of the set of known models, and, looking ahead, refer to it as our baseline model. Therefore it is useful to rewrite Eq. (1) as

$$ \mathcal{D} = \left[ 1 + \frac{\langle B_{i\Lambda} \rangle}{B_{X\Lambda}} \, \frac{1 - p(X)}{p(X)} \right]^{-1}, \tag{4} $$

where we have introduced the Bayes factor

$$ B_{ij} \equiv \frac{p(d|M_i)}{p(d|M_j)} \tag{5} $$

and the average Bayes factor between ΛCDM and each of the known models:

$$ \langle B_{i\Lambda} \rangle \equiv \frac{1}{N} \sum_{i=1}^{N} B_{i\Lambda}. \tag{6} $$

(Note that the sum over models $M_i$ includes $i = \Lambda$, and therefore $\langle B_{i\Lambda} \rangle \geq 1/N$.)

Rather than looking at $\mathcal{D}$ directly, one can also consider the relative change in doubt $\mathcal{R}$, given by the ratio of posterior to prior doubt:

$$ \mathcal{R} \equiv \frac{\mathcal{D}}{p(X)} = \left[ p(X) + \left(1 - p(X)\right) \frac{\langle B_{i\Lambda} \rangle}{B_{X\Lambda}} \right]^{-1}. \tag{7} $$

A necessary condition for doubt to grow ($\mathcal{R} > 1$) is

$$ \frac{\langle B_{i\Lambda} \rangle}{B_{X\Lambda}} \ll 1, \tag{8} $$

i.e., that the Bayes factor between model $X$ and ΛCDM be much larger than the average Bayes factor between the known models and ΛCDM.

However, for ΛCDM to be genuinely doubted it is not sufficient that $\mathcal{R} > 1$. One also has to require that the probability for ΛCDM itself decreases, i.e., that $p(\Lambda|d) < p(\Lambda)$. Applying Bayes' theorem again, one finds that the ratio of the posterior probability for ΛCDM to its prior probability is given by

$$ \mathcal{R}_\Lambda \equiv \frac{p(\Lambda|d)}{p(\Lambda)} = \left[ \left(1 - p(X)\right) \langle B_{i\Lambda} \rangle + p(X)\, B_{X\Lambda} \right]^{-1}. \tag{9} $$

Hence to gather genuine doubt against ΛCDM we require that both conditions $\mathcal{R} > 1$ and $\mathcal{R}_\Lambda < 1$ be fulfilled.
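The relations above are simple enough to evaluate directly. The following is a minimal Python sketch of Eqs. (4), (7) and (9); the function name, argument names and the illustrative input values are assumptions introduced here for illustration and are not taken from the original analysis.

```python
def doubt_summary(p_x, avg_b_known, b_x):
    """Evaluate Eqs. (4), (7) and (9).

    p_x         : prior doubt p(X), with 0 < p_x < 1
    avg_b_known : average Bayes factor <B_iLambda> of the known models vs LCDM
    b_x         : Bayes factor B_XLambda of the unknown model vs LCDM
    """
    if not 0.0 < p_x < 1.0:
        raise ValueError("prior doubt must lie strictly between 0 and 1")
    # Eq. (4): posterior doubt D = p(X|d)
    doubt = 1.0 / (1.0 + (avg_b_known / b_x) * (1.0 - p_x) / p_x)
    # Eq. (7): relative change in doubt, R = D / p(X)
    rel_doubt = doubt / p_x
    # Eq. (9): relative change in the LCDM probability, R_Lambda
    rel_lcdm = 1.0 / ((1.0 - p_x) * avg_b_known + p_x * b_x)
    return doubt, rel_doubt, rel_lcdm


# Hypothetical inputs, chosen only to illustrate a case of genuine doubt
doubt, rel_doubt, rel_lcdm = doubt_summary(p_x=0.01, avg_b_known=0.5, b_x=200.0)
genuine_doubt = rel_doubt > 1.0 and rel_lcdm < 1.0
print(doubt, rel_doubt, rel_lcdm, genuine_doubt)
```

With these hypothetical inputs both conditions for genuine doubt hold, because the product of prior doubt and Bayes factor exceeds unity.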



2.2 Upper bound on the evidence of the 
unknown model 

In order to apply Bayesian doubt to the problem of cosmological model selection, it is necessary to estimate the evidence of the unknown model, $p(d|X)$. The approach suggested by Starkman et al. (2008) was to calibrate the value of $p(d|X)$ on simulated data sets from the best among the known models. This has been shown to lead to model discovery for a toy linear model. However, in the cosmological context it would be very computationally expensive to implement, even given fast algorithms to compute the evidence, such as MultiNest (Feroz et al. 2009) or the Savage-Dickey density ratio (Trotta 2007).

In this paper, we put forward a different, more economical approach, which aims at computing an absolute upper bound for $p(d|X)$. Since our aim is to investigate the dark energy sector, in the following we focus on the dark energy equation of state, $w(z)$. We cannot, of course, compute the evidence for $X$ explicitly since its parameterization of $w(z)$ is unspecified. Since the unknown model $X$ is to provide a benchmark value for the evidence of the known models, it should be designed to provide a good fit to the available data, including cosmic microwave background (CMB), matter power spectrum (mpk) and supernovae type Ia (SNIa) observations. Therefore, the unknown model should have a high degree of flexibility. At the same time, we do not wish to incur the Occam's razor penalty coming from the high number of free parameters usually associated with a very flexible model. This is because we are seeking to build a phenomenological description for $w(z)$ which, if model $X$ is to be a 'good' model, should arise from an underlying, presently unknown theory with a small number of free parameters.

In order to have the advantages of a flexible (and therefore well-fitting) unknown model (i.e. low $\chi^2$/dof), without incurring a penalty for having a large number of free parameters, we define the evidence of the unknown model via the upper bound on the Bayes factor between the ΛCDM baseline model and a stand-in model with a very flexible $w(z)$ (as specified in section 3.3 below). The absolute upper bound on the Bayes factor $B_{X\Lambda}$ between the unknown model $X$ and ΛCDM (denoted by a subscript $\Lambda$) is given by (see Gordon & Trotta 2007 and references therein for details),



$$ B_{X\Lambda} < \bar{B}_{X\Lambda} \equiv \exp\left( -\frac{1}{2} \left( \chi^2_X - \chi^2_\Lambda \right) \right). \tag{10} $$

We have defined the best-fit chi-squared as minus twice the log-likelihood at the best-fit point, $\theta_i^*$:

$$ \chi^2_i \equiv -2 \ln p(d|\theta_i^*, M_i), \tag{11} $$

where $i = X, \Lambda$.

The bound of Eq. (10) arises by putting a posteriori the prior probability for the parameters of the stand-in model into a delta-function located at the observed maximum likelihood value, i.e. by replacing $p(\theta_X|M_X)$ in Eq. (2) with $\delta(\theta_X - \theta_X^*)$. While this prior choice has no Bayesian justification (for it is inappropriate to use a posteriori information to determine the prior), it does lead to an absolute upper bound on the relative evidence between the baseline ΛCDM model and the unknown model. In order to calculate the absolute bound of Eq. (10), all that is needed is the difference between the best-fit log-likelihood (or chi-squared) of the two models, $\Delta\chi^2 = \chi^2_X - \chi^2_\Lambda$, which can be easily computed. Since the ΛCDM model is nested within the unknown model (i.e., the unknown model reverts to ΛCDM for a specific choice of its parameters leading to $w(z) = -1$), it follows that $\Delta\chi^2 \leq 0$. Therefore it is clear that by construction $\bar{B}_{X\Lambda} \geq 1$ always, i.e., that our unknown model is always at least as good as ΛCDM.
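As a hedged illustration of Eq. (10), the sketch below converts a pair of best-fit chi-squared values into the upper bound on the Bayes factor. The function name and the numerical inputs are hypothetical and serve only to show the arithmetic.

```python
import math


def bayes_factor_upper_bound(chi2_x, chi2_lcdm):
    """Upper bound of Eq. (10): exp(-(chi2_X - chi2_Lambda) / 2).

    For a nested baseline, chi2_x <= chi2_lcdm is expected, so the
    returned bound is >= 1; a value below 1 would signal that the
    best-fit search for model X has not converged.
    """
    delta_chi2 = chi2_x - chi2_lcdm
    return math.exp(-0.5 * delta_chi2)


# Hypothetical best-fit values: model X improves the fit by 2 units of chi2
bound = bayes_factor_upper_bound(chi2_x=98.0, chi2_lcdm=100.0)
print(bound)
```

An improvement of two units in the best-fit chi-squared therefore corresponds to a bound of $e \approx 2.7$, which illustrates how slowly the bound grows with the quality of fit.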

By inspecting Eq. (10), one might be tempted to think that this upper bound on the Bayes factor merely translates in Bayesian terms the old goodness-of-fit $\chi^2$ test. For if ΛCDM is a "bad" model (on whatever scale one wishes to define this), the value of $\chi^2_\Lambda$ will be large and thus the Bayes factor in favour of the unknown model will be large, as well. Thus one might think that Eq. (10) simply rephrases the well-known rule-of-thumb of $\chi^2/\mathrm{dof} \approx 1$. However, this is not the case, for the $\chi^2/\mathrm{dof} \approx 1$ rule only applies asymptotically (for $n \to \infty$ number of data points) and only if the data points are independent and Gaussian distributed. Those conditions are almost invariably not met in the cosmological context. For instance, it is not even clear how one would define the concept of degrees of freedom for the CMB data, given that the $C_\ell$'s are not independent and are not Gaussian distributed. In the case of SNIa observations, the $\chi^2/\mathrm{dof} \approx 1$ criterion is satisfied for ΛCDM by construction, for the value of the intrinsic dispersion for the SNe is adjusted in such a way as to require this to be the case, see e.g. Kowalski et al. (2008). Therefore one cannot meaningfully use this kind of absolute goodness-of-fit test on such a data set.

    |ln B_ij|    Odds        Strength of evidence
    < 1.0        < 3:1       Inconclusive
    1.0          ~ 3:1       Weak evidence
    2.5          ~ 12:1      Moderate evidence
    5.0          ~ 150:1     Strong evidence

Table 1. Empirical scale for evaluating the strength of evidence from the Bayes factor $B_{ij}$ between two models (so-called 'Jeffreys' scale'). The right-most column gives our convention for denoting the different levels of evidence above these thresholds, following Gordon & Trotta (2007).
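The thresholds of Table 1 can be applied mechanically. The following Python sketch maps a value of $\ln B_{ij}$ to the corresponding label; the function name is an assumption made here, and the treatment of values exactly at a threshold (assigned to the stronger category) is one reading of the phrase "above these thresholds".

```python
def jeffreys_strength(ln_b):
    """Return the Table 1 label for a given ln B_ij (sign is ignored)."""
    magnitude = abs(ln_b)
    if magnitude < 1.0:
        return "Inconclusive"
    if magnitude < 2.5:
        return "Weak evidence"
    if magnitude < 5.0:
        return "Moderate evidence"
    return "Strong evidence"


for value in (0.4, 1.0, 2.6, 5.5):
    print(value, jeffreys_strength(value))
```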

Instead, the upper bound given by Eq. (10) does not require any assumption about asymptotic behaviour, nor that the data are Gaussian distributed, nor independent. One only needs to be able to compute the log-likelihood at the best-fit point, including relevant correlations as necessary.

Finally, the upper bound of Eq. (10) could also be computed using the highest best-fit log-likelihood of all the known models, at no extra computational cost. This would give the absolute upper bound achievable among the class of known models. Although we do not pursue this approach in this paper, we recommend including in any Bayesian model comparison a model $X$ with evidence obtained via this procedure, for this will give an estimate of the maximum possible level of doubt that can arise from the known models with their assigned priors.



2.3 Behaviour of doubt and posterior
probability for ΛCDM

In the following, we will adopt the absolute upper bound $\bar{B}_{X\Lambda}$ of Eq. (10) as an estimator for the Bayes factor of the unknown model $X$, and explore the consequences in terms of doubt and in terms of the posterior probability for ΛCDM. It is clear from Eqs. (4) and (9) that for a given level of prior doubt $p(X)$, the posterior model probabilities are controlled uniquely by the two quantities $\langle B_{i\Lambda} \rangle$ and $\bar{B}_{X\Lambda}$. The result can be expected to fall within one of the three scenarios below, which we will examine from two points of view: using doubt $\mathcal{D}$ and using the upper bound to the Bayes factor $\bar{B}_{X\Lambda}$ as measures of doubt. While there is something to be said for employing $\bar{B}_{X\Lambda}$ (whose value can be translated into a strength of evidence via the Jeffreys' scale, given in Table 1) as a criterion for goodness of fit, it turns out that doubt can shed some light onto how large $\bar{B}_{X\Lambda}$ should be to have genuine doubt without referring to the (in some sense arbitrarily calibrated) Jeffreys' scale.

• Case 1: $\bar{B}_{X\Lambda} \gg 1$ and $\langle B_{i\Lambda} \rangle \sim 1$: in this case, the unknown model has a much better evidence than ΛCDM, which in turn has about the same evidence as the other known models. As the Bayes factor $\bar{B}_{X\Lambda} \gg 1$, we should expect there to be a significant amount of doubt, $\mathcal{D} \sim 1$. And indeed, from Eq. (1) the doubt is, assuming $p(X) \ll 1$:

$$ \mathcal{D} \approx \left[ 1 + \frac{1}{p(X)\,\bar{B}_{X\Lambda}} \right]^{-1} \to 1 \tag{12} $$

for $p(X)\,\bar{B}_{X\Lambda} \gg 1$. In other words, we are inclined to believe that there is a better model that we have not yet thought of if the Bayes factor between the unknown model and ΛCDM is sufficiently large to override the smallness of the prior doubt, $\bar{B}_{X\Lambda} \gg 1/p(X)$ (notice the independence of the Jeffreys' scale). The change in the probability for ΛCDM itself is given by, from Eq. (9),

$$ \mathcal{R}_\Lambda \approx \left( 1 + p(X)\,\bar{B}_{X\Lambda} \right)^{-1}. \tag{13} $$

While the doubt grows ($\mathcal{D} \to 1$) the probability for ΛCDM declines, $\mathcal{R}_\Lambda \ll 1$. In this case, one is led to genuinely doubt ΛCDM.

• Case 2: $\bar{B}_{X\Lambda} \gg 1$ and $B_{i\Lambda} \ll 1$ ($i \neq \Lambda$): in this case, ΛCDM is clearly the best of the known models, as the Bayes factors between the known models and ΛCDM are all small. Again, as the Bayes factor $\bar{B}_{X\Lambda} \gg 1$ favours the unknown model, we should be doubting our list of models. As $\langle B_{i\Lambda} \rangle \approx 1/N$, we find

$$ \mathcal{D} \approx \left[ 1 + \frac{1}{N\,p(X)\,\bar{B}_{X\Lambda}} \right]^{-1} \to 1 \tag{14} $$

for $p(X)\,\bar{B}_{X\Lambda} \gg 1/N$. This seems to contradict the result of Case 1. However, as we noted above, the condition that $\mathcal{D} \approx 1$ is only necessary but not sufficient for doubt to arise. We need to examine the relative change in probability for ΛCDM, which is given by

$$ \mathcal{R}_\Lambda \approx \left[ \frac{1}{N} + p(X)\,\bar{B}_{X\Lambda} \right]^{-1}. \tag{15} $$

Requiring $\mathcal{R}_\Lambda < 1$ leads to the stronger condition $p(X)\,\bar{B}_{X\Lambda} \gtrsim 1$, as in Case 1 above. If the latter condition is not fulfilled, doubt will grow at the expense of the probability of the other known models, as the prior probability mass which was spread among the $N$ known models according to Eq. (3) gets redistributed between $X$ and ΛCDM.

• Case 3: $\bar{B}_{X\Lambda} \sim 1$: in this case, the upper bound on the Bayes factor between the unknown model and ΛCDM is of order unity. This means that we should have no reason to doubt our set of models. The expression for doubt Eq. (1) simplifies to

$$ \mathcal{D} \approx \left[ 1 + \frac{\langle B_{i\Lambda} \rangle}{p(X)} \right]^{-1}. \tag{16} $$

In order to reach a high level of doubt $\mathcal{D} \approx 1$, we would need $\langle B_{i\Lambda} \rangle / p(X) \sim 0$. Clearly, this is only the case if we allow for $p(X) \gg \langle B_{i\Lambda} \rangle \geq 1/N$, i.e. if we are starting off with a prior doubt which is larger than the indifference prior on the known models, which is usually not the case. Otherwise, if the Bayes factor $\bar{B}_{X\Lambda}$ is larger than the prior doubt $p(X)$, we can regard our list of models as reasonably complete, and perform Bayesian model comparison among the list of known models. Of course, this procedure must be repeated once new data arrive (see Starkman et al. (2009) for the procedure that this entails). Note that again we do not need to refer to the Jeffreys' scale, but need to compare the average Bayes factor $\langle B_{i\Lambda} \rangle$ with our prior doubt $p(X)$.

In summary, we are led to doubt the current baseline ΛCDM model only if the rule of thumb

$$ p(X)\,\bar{B}_{X\Lambda} \gg 1 \tag{17} $$

is satisfied, which corresponds to either Case 1 above or to Case 2 when the condition for $\mathcal{R}_\Lambda < 1$ is also fulfilled. If Eq. (17) is satisfied, we are guaranteed that doubt will grow and at the same time the probability for the ΛCDM model will decrease, thus signaling the opportunity for model discovery. All this is accomplished without referring to Jeffreys' scale.
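The three scenarios and the rule of thumb of Eq. (17) can be checked numerically. The sketch below evaluates Eqs. (7) and (9) for a few hypothetical combinations of average Bayes factor and upper bound, with $N = 4$ known models and a 1% prior doubt; all numerical values and function names are illustrative assumptions, not results from this work.

```python
def relative_changes(p_x, avg_b_known, b_x):
    """Return (R, R_Lambda) from Eqs. (7) and (9)."""
    rel_doubt = 1.0 / (p_x + (1.0 - p_x) * avg_b_known / b_x)
    rel_lcdm = 1.0 / ((1.0 - p_x) * avg_b_known + p_x * b_x)
    return rel_doubt, rel_lcdm


def is_genuine_doubt(p_x, avg_b_known, b_x):
    """Genuine doubt requires R > 1 and R_Lambda < 1 simultaneously."""
    rel_doubt, rel_lcdm = relative_changes(p_x, avg_b_known, b_x)
    return rel_doubt > 1.0 and rel_lcdm < 1.0


P_X = 0.01      # hypothetical prior doubt
N_MODELS = 4    # hypothetical number of known models

scenarios = {
    # label: (average Bayes factor of known models, upper bound for X)
    "case 1": (1.0, 1000.0),                          # p(X) B = 10
    "case 2, p(X) B = 0.5": (1.0 / N_MODELS, 50.0),   # doubt grows, LCDM too
    "case 2, p(X) B = 10": (1.0 / N_MODELS, 1000.0),
    "case 3": (0.5, 1.0),
}

results = {
    label: is_genuine_doubt(P_X, avg_b, b_x)
    for label, (avg_b, b_x) in scenarios.items()
}
for label, flag in results.items():
    print(label, "-> genuine doubt" if flag else "-> no genuine doubt")
```

In the second scenario doubt grows but the probability of ΛCDM grows as well, since the probability mass is taken from the other known models; only when $p(X)\,\bar{B}_{X\Lambda}$ exceeds unity does the probability of ΛCDM decrease.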



3 APPLICATION OF DOUBT TO THE 
DARK ENERGY EQUATION OF STATE 

3.1 The known models 

We take the flat ΛCDM model as our baseline model, described by the usual 6-parameter set $\theta = \{A_s, n_s, \omega_b, \omega_c, \Omega_\Lambda, H_0\}$, where $A_s$ is the amplitude of scalar fluctuations, $n_s$ is the spectral index, $\omega_b$ the physical baryon density, $\omega_c$ the cold dark matter density, $\Omega_\Lambda$ the density parameter for the cosmological constant and $H_0$ the Hubble constant today. We assume purely adiabatic fluctuations throughout this paper.

We define the other models in the known models list by increasing the complexity of the baseline model in successive steps. First, we only add a non-zero curvature parameter, $\Omega_\kappa \neq 0$, with a flat prior in the range $-0.3 \leq \Omega_\kappa \leq 0.3$, akin to the "Astronomer's prior" adopted and justified in Vardanyan et al. (2009). Alternatively, another model is obtained by only adding an effective equation of state parameter for dark energy, $w \neq -1$, with a flat prior in the range $-1.3 \leq w \leq -0.7$ while keeping $\Omega_\kappa = 0$ fixed. Finally, a fourth model with 8 free parameters is obtained by adding both $\Omega_\kappa \neq 0$ and $w \neq -1$ with the above priors to the ΛCDM baseline model.

One could in principle further increase the complexity of the known models, e.g. by adopting more complex descriptions for $w(z)$, such as the so-called CPL parameterization in terms of the parameters $(w_0, w_a)$. However, those models have in general a lower evidence than ΛCDM, as they are penalized for their wasted parameter space, see e.g. Liddle et al. (2006a). As a consequence, they are expected not to contribute significantly to $\langle B_{i\Lambda} \rangle$, and therefore their influence on posterior doubt would be minor, see section 4.2 for details. One could also add to the list alternative explanations for the apparent acceleration of the Universe, such as for example modified gravity models, provided one can compute their evidence numerically (Heavens et al. 2007). As the main goal of this paper is to introduce the methodology related to Bayesian doubt, we however restrict our considerations to the four models listed above. We comment in section 4.2 on how our results would change if the list of known models were further enlarged.

Finally, in this work we do not address the problem 
of the fine tuning of the value of the cosmological con- 
stant itself. All models we consider here suffer equally 
from the fine tuning problem, i.e., the fact that the mea- 
sured value of the cosmological constant is some 120 or- 
ders of magnitude smaller than the "natural" scale set by 
the Planck mass if Λ arises from quantum fluctuations
of the vacuum. Anthropic reasoning in the context of the 
Multiverse has been invoked to explain the smallness of 
the cosmological constant, and while Bayesian reasoning 
could be brought to bear on the effectiveness of such an 
"explanation", we shall not consider this aspect further 
in the present paper. 



3.2 Parameterization of the unknown model 

Our discussion so far has been completely general, 
sidestepping the crucial issue of how to evaluate Eq. (10)
for the unknown model. In order to make further 
progress, we have to make some assumptions regarding 
the class of alternative models the unknown model X is 
supposed to come from. 

As we are interested in the dark energy sector, we will assume that the phenomenology of model $X$ is such that it only leads to modifications to the right-hand side of Einstein's equations. In other words, we do not investigate models that modify General Relativity except for those whose only impact is a change in the effective energy-momentum tensor. Under this assumption, a model $X$ is fully specified once we give its redshift-dependent equation of state of dark energy $w(z)$. Notice that we also implicitly assume that the Universe is well described by a FRW isotropic cosmology. If one wished to include a more general class of alternative models from which to draw $X$, one could do so by parameterizing their phenomenology in a suitable way. One could define even more general classes of alternative models, for example by fitting parameterized functions to the observations. However, we do not pursue this approach here, because such a modeling of the data would be devoid of any physical insight and would achieve a purely descriptive fit to the observations. To see why this is not desirable, one only has to push this approach to its extreme consequences: given any data collection, there is always a "model" that fits the data perfectly. This model is obtained by simply choosing the value of the "theory" to be identical to the observed value for each of the observations. Of course, nobody would ever consider such a model to be a valid scientific theory, because we demand that the latter should have explanatory power, not be a simple description of the data. Therefore, it seems sensible to require from the outset that our unknown model $X$ be part of a class of physical theories, with phenomenological parameters that are linked with the physical framework of the class of models considered (here, FRW isotropic Universes with time-varying dark energy equation of state and otherwise standard cosmology).

Therefore we are left with the task of parameterizing $w(z)$ as a function of redshift, and then using its functional form to compute the $\Delta\chi^2$ between the unknown model and the ΛCDM baseline model. To this purpose, we employ the Parameterized Post-Friedmann (PPF) prescription developed by Hu & Sawicki (2007); Hu (2008). The PPF prescription was originally introduced to describe the behaviour of theories of modified gravity in a metric framework that describes leading order deviations from general relativity (subject to certain assumptions). However, it was also found to be well-suited for describing the evolution of dark energy models that cross the so-called "phantom divide", $w = -1$. Crossing this phantom divide in models with fixed sound speed would lead to divergences in the pressure perturbations. Hence models that are phenomenologically described by a time-varying $w(z)$ that crosses $w = -1$ must be described micro-physically by a theory of scalar fields with a varying speed of sound, e.g. DGP-type models.



3.3 Numerical implementation and data sets 

Below, we investigate the behaviour of doubt for different combinations of cosmological data sets. In particular, we are interested in studying doubt as the constraining power of the combined data increases.

We modified the CosmoMC (Lewis & Bridle 2002) parameter estimation package to sample the additional parameters $w_i = w(z_i)$, where the $z_i$ are uniformly spaced at $n = 10$ redshift values, ranging from $z = 0$ to $z = 1.5$. Fang et al. (2008a,b) wrote a plugin to CAMB (Lewis et al. 2000) that implements the PPF prescription and is freely available for download¹, which we adopted for this work. The PPF module uses cubic splines to interpolate $w$ between these points, and assumes $w(z > 1.5) = w(z = 1.5)$.
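The binned description of $w(z)$ can be sketched as follows. This is a stand-alone illustration and not the PPF module itself: the natural boundary condition of the spline (vanishing second derivative at both ends), the placement of the ten nodes including both end points, and all names are assumptions made here.

```python
import bisect


def natural_cubic_spline(xs, ys):
    """Return a callable natural cubic spline through the points (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Second derivatives m[0..n]; natural spline: m[0] = m[n] = 0
    m = [0.0] * (n + 1)
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
           for i in range(1, n)]
    # Thomas algorithm, forward elimination on the tridiagonal system
    for k in range(1, n - 1):
        factor = h[k] / diag[k - 1]
        diag[k] -= factor * h[k]
        rhs[k] -= factor * rhs[k - 1]
    # Back substitution for the interior second derivatives
    for k in range(n - 2, -1, -1):
        m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def spline(x):
        # Index of the interval [xs[i], xs[i+1]] containing x
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        a = xs[i + 1] - x
        b = x - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] - m[i] * h[i] ** 2 / 6.0) * a / h[i]
                + (ys[i + 1] - m[i + 1] * h[i] ** 2 / 6.0) * b / h[i])

    return spline


Z_MAX = 1.5
Z_NODES = [Z_MAX * i / 9 for i in range(10)]  # 10 uniformly spaced nodes


def make_w_of_z(w_nodes):
    """Build w(z) from node values, held constant beyond Z_MAX."""
    spline = natural_cubic_spline(Z_NODES, list(w_nodes))

    def w_of_z(z):
        if z < 0.0:
            raise ValueError("redshift must be non-negative")
        return spline(min(z, Z_MAX))

    return w_of_z


# Hypothetical node values, for illustration only
example_nodes = [-1.0, -0.9, -1.1, -1.0, -0.95, -1.05, -1.0, -1.1, -0.9, -1.0]
w_example = make_w_of_z(example_nodes)
w_lcdm = make_w_of_z([-1.0] * 10)
print(w_example(0.4), w_example(3.0), w_lcdm(0.7))
```

Setting all node values to $-1$ recovers the cosmological constant, which is the sense in which ΛCDM is nested within this parameterization.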

We adopted the 307 SNe Ia from the "Union" data set compiled by Kowalski et al. (2008). The CMB data and likelihood used was the WMAP five year data set (Dunkley et al. 2009). Tegmark et al. (2006) provided the data and likelihood code for the matter power spectrum using SDSS DR4. The evidence for the known models is computed using the publicly available MultiNest code (Feroz & Hobson 2008; Feroz et al. 2009; Trotta et al. 2008), which implements the nested sampling algorithm, employed as an add-in sampler to CosmoMC (Lewis & Bridle 2002) and CAMB (Lewis et al. 2000).

¹ http://camb.info/ppf/

The gist of nested sampling is that the multi-dimensional evidence integral of Eq. (2) is recast into a one-dimensional integral. This is accomplished by defining the prior volume $x$ as $dx = p(\theta)\,d\theta$ so that

$$ x(\lambda) = \int_{\mathcal{L}(\theta) > \lambda} p(\theta)\, d\theta, \tag{18} $$

where the integral is over the parameter space enclosed by the iso-likelihood contour $\mathcal{L}(\theta) = \lambda$. So $x(\lambda)$ gives the volume of parameter space above a certain level $\lambda$ of the likelihood. Then the Bayesian evidence, Eq. (2), can be written as

$$ p(d) = \int_0^1 \mathcal{L}(x)\, dx, \tag{19} $$

where $\mathcal{L}(x)$ is the inverse of Eq. (18). Samples from $\mathcal{L}(x)$ can be obtained by drawing samples uniformly from the prior volume enclosed by the iso-likelihood surface defined by $\lambda$. The 1-dimensional integral of Eq. (19) can be obtained by simple quadrature, thus

$$ p(d) \approx \sum_i \mathcal{L}(x_i)\, w_i, \tag{20} $$

where the weights are $w_i = \frac{1}{2}(x_{i-1} - x_{i+1})$. The standard deviation on the value of the log-evidence can be estimated as $(H/n_{\rm live})^{1/2}$, where $H$ is the negative relative entropy and $n_{\rm live}$ is the number of live points adopted, which in our case is $n_{\rm live} = 4000$ (see Feroz & Hobson (2008) for details).
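To make the quadrature of Eq. (20) concrete, the following toy Python sketch applies nested sampling to a one-dimensional Gaussian likelihood with a flat prior on $[-1, 1]$, for which the evidence is known analytically. It is not the MultiNest algorithm: the likelihood, the prior, the number of live points, the deterministic shrinkage $x_i = \exp(-i/n_{\rm live})$ and all names are assumptions chosen for illustration. In this toy problem the constrained prior draw is exact because the likelihood decreases monotonically with $|\theta|$.

```python
import math
import random


def toy_nested_sampling(sigma=0.1, n_live=400, n_iter=2400, seed=1):
    """Evidence of L(theta) = exp(-theta^2 / (2 sigma^2)), flat prior on [-1, 1]."""
    rng = random.Random(seed)

    def likelihood(theta):
        return math.exp(-0.5 * (theta / sigma) ** 2)

    live = [rng.uniform(-1.0, 1.0) for _ in range(n_live)]
    # Expected prior volumes after i shrinkage steps, with x[0] = 1
    x = [math.exp(-i / n_live) for i in range(n_iter + 2)]
    evidence = 0.0
    for i in range(1, n_iter + 1):
        # Lowest-likelihood live point = largest |theta| for this likelihood
        worst = max(range(n_live), key=lambda k: abs(live[k]))
        bound = abs(live[worst])
        # Quadrature term of Eq. (20) with w_i = (x_{i-1} - x_{i+1}) / 2
        evidence += likelihood(bound) * 0.5 * (x[i - 1] - x[i + 1])
        # Replace it by a prior draw restricted to L > L_min, i.e. |theta| < bound
        live[worst] = rng.uniform(-bound, bound)
    # Contribution of the remaining live points inside the final volume
    evidence += x[n_iter] * sum(likelihood(t) for t in live) / n_live
    return evidence


SIGMA = 0.1
z_estimate = toy_nested_sampling(sigma=SIGMA)
# Analytic evidence: (1/2) * integral of the Gaussian over [-1, 1]
z_exact = 0.5 * SIGMA * math.sqrt(2.0 * math.pi) * math.erf(1.0 / (SIGMA * math.sqrt(2.0)))
print(z_estimate, z_exact)
```

The scatter of the estimate is expected to follow the $(H/n_{\rm live})^{1/2}$ scaling quoted above, so increasing the number of live points tightens the agreement with the analytic value.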

The best-fit $\chi^2$ required to evaluate Eq. (10) is obtained by performing a Metropolis-Hastings Markov Chain Monte Carlo (MCMC) reconstruction of the posterior of the 16-parameter model comprising the ΛCDM parameters $\theta$ and the above 10-parameter description of $w(z)$. We gather a total of 5 x 10^ samples in 8 parallel chains and verify that the Gelman & Rubin mixing criterion (Gelman & Rubin 1992) is satisfied (i.e., $R \ll 0.1$, where $R$ is the inter-chain variance divided by the intra-chain variance).
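A minimal sketch of this mixing diagnostic for a single parameter is given below. It reads $R$ literally as the variance of the chain means divided by the mean within-chain variance; the estimator actually implemented in CosmoMC may differ in detail, and the function name and synthetic chains are assumptions made for illustration.

```python
import random
import statistics


def variance_ratio(chains):
    """Variance of the chain means divided by the mean within-chain variance."""
    means = [statistics.fmean(chain) for chain in chains]
    between = statistics.variance(means)
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    return between / within


rng = random.Random(42)
# Eight synthetic, well-mixed chains sampling the same unit Gaussian
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(8)]
# Eight synthetic chains stuck at different locations (poor mixing)
stuck = [[rng.gauss(float(k), 1.0) for _ in range(2000)] for k in range(8)]

r_mixed = variance_ratio(mixed)
r_stuck = variance_ratio(stuck)
print(r_mixed, r_stuck)
```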

MCMC is rather geared towards exploring the bulk of the posterior probability density, and is not particularly optimised to look for the absolute best-fit value. This is especially true for high dimensional parameter spaces. Therefore, we expect that the best-fit $\chi^2$ values recovered via MCMC for the 16-dimensional model $X$ are going to be systematically higher than the true best-fit. In order to estimate and correct for this numerical bias, we sampled via MCMC a 16-dimensional Gaussian of unit variance, recovered the best-fit $\chi^2$ and compared it with the true best-fit value, repeating the procedure 5000 times. This gives an estimate of the numerical bias, under the assumption (which is valid locally) that the posterior distribution of model $X$ is close to Gaussian in the immediate vicinity of the best-fit. We found that the MCMC systematically overestimates the best-fit $\chi^2$ value by $0.94 \pm 0.14$, and therefore subtracted this estimate from the recovered $\chi^2$ best-fit value for model $X$. We also verified that the numerical bias in recovering the best-fit $\chi^2$ for a 6-dimensional parameter space (such as ΛCDM) is negligible in comparison.
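The origin of this bias can be illustrated with a simplified experiment. The sketch below replaces the MCMC chains with independent draws from a unit Gaussian, which is an assumption made to keep the example short, and uses far fewer samples and repetitions than the procedure described above. It is therefore not expected to reproduce the quoted value of 0.94, only the qualitative dependence on dimensionality.

```python
import random


def min_chi2_of_samples(n_dim, n_samples, rng):
    """Smallest chi2 = sum(theta_k^2) among samples of a unit Gaussian.

    The true best-fit of this target is chi2 = 0 at the origin, so the
    returned value is directly the overestimate of the best-fit chi2.
    """
    best = float("inf")
    for _ in range(n_samples):
        chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_dim))
        best = min(best, chi2)
    return best


def mean_bias(n_dim, n_samples=1000, n_repeats=10, seed=0):
    """Average overestimate of the best-fit chi2 over repeated experiments."""
    rng = random.Random(seed)
    total = sum(min_chi2_of_samples(n_dim, n_samples, rng) for _ in range(n_repeats))
    return total / n_repeats


bias_16 = mean_bias(16)  # dimensionality of the stand-in model X
bias_6 = mean_bias(6)    # dimensionality of the LCDM baseline
print(bias_16, bias_6)
```

The overestimate is markedly larger in 16 dimensions than in 6 because the volume of parameter space close to the peak shrinks rapidly with dimensionality, so a finite sample rarely lands near the true best-fit point.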



4 RESULTS AND DISCUSSION 

We now proceed to evaluate the doubt and the posterior probability of ΛCDM for various combinations of cosmological data sets.



4.1 Model comparison outcome including 
doubt 

In Table 2 we present the estimated upper limit on the Bayes factor between ΛCDM and model $X$ as well as the Bayes factors with respect to ΛCDM for the other known models. Among the known models, we confirm what many others have shown: that ΛCDM is the best-fit known model, or at least that no other model is demonstrably better. Thus, we find an inconclusive model comparison result (according to the Jeffreys' scale, Table 1) when comparing ΛCDM and a model with a free (but constant) $w$. We also find weak to moderate evidence ($1 < \ln B < 2.5$) against spatially curved models when compared to a flat ΛCDM, in agreement with the more detailed findings of Vardanyan et al. (2009). Finally, there is weak to moderate evidence against the most complex of the known models, one exhibiting both $w \neq -1$ and $\Omega_\kappa \neq 0$. This is in good agreement with the results of previous more thorough analyses, e.g. Liddle et al. (2006a,b) and Li et al. (2009). From this, ordinary Bayesian model comparison concludes that ΛCDM is still the best of the known models (at least for the limited range of alternative models considered here).
Most importantly, in the table, we report the improvement in the best-fit log-likelihood obtained over ΛCDM by using $X$, and use this to compute an absolute upper bound to the Bayes factor via Eq. (10). We notice that the improvement in the best-fit is fairly modest for all the data sets considered, supportive of the general sentiment in the community that ΛCDM is in good agreement with available observations and that therefore there is little room for statistical improvement of the quality of fit. This is in part because it is very hard to improve the quality of fit by changing $w(z)$: observables are usually a double integral of $w(z)$, and therefore insensitive to features in the equation of state (see e.g. Huterer & Turner (1999); Maor et al. (2001) and Clarkson (2009)). As a consequence, even a highly flexible $w(z)$ model such as the one we used here to describe $X$ will lead to only small observable departures from the standard cosmological constant scenario. It is important to keep in mind that such statements depend strongly on the statistics one employs to examine the models. For example, the standard likelihood function for CMB data is insensitive to most of the reported anomalies in the low-$\ell$ CMB (Copi et al. (2010); Bennett et al. (2010)).

    Known models (columns a-c give ln B_jΛ):
      (a)  -1.3 < w < -0.7,  Ω_κ = 0.0
      (b)  w = -1.0,         -0.3 < Ω_κ < 0.3
      (c)  -1.3 < w < -0.7,  -0.3 < Ω_κ < 0.3
    "Unknown" model X: columns Δχ² and ln B̄_XΛ

    Data set            (a)            (b)            (c)            Δχ²           ln B̄_XΛ      ⟨B_iΛ⟩        B̄_XΛ
    CMB only             0.18 ± 0.09   -1.03 ± 0.09   -1.09 ± 0.09   -1.21 ± 0.3   0.61 ± 0.2   0.72 ± 0.03   1.83 ± 0.3
    CMB + SN            -0.37 ± 0.09   -1.30 ± 0.09   -1.63 ± 0.09   -2.34 ± 0.3   1.17 ± 0.2   0.54 ± 0.02   3.22 ± 0.5
    CMB + mpk           -0.50 ± 0.08   -2.57 ± 0.08   -2.69 ± 0.08   -0.88 ± 0.3   0.44 ± 0.2   0.44 ± 0.01   1.55 ± 0.2
    CMB + SN + mpk      -0.48 ± 0.09   -2.51 ± 0.09   -2.73 ± 0.09   -2.15 ± 0.3   1.08 ± 0.2   0.44 ± 0.01   2.93 ± 0.6

Table 2. In the first three columns, we report the Bayes factors between the known models and ΛCDM for different combinations of data sets, where $\ln B_{j\Lambda} < 0$ favours ΛCDM. The fourth column gives $\Delta\chi^2 = \chi^2_X - \chi^2_\Lambda$, the improvement in the best-fit log-likelihood obtained by using model $X$ (specified in the text) over ΛCDM. The last column gives the corresponding absolute upper bound to the Bayes factor between model $X$ and ΛCDM.

The interesting consequence from the point of view of doubt
is that this translates into strong upper limits for the
Bayes factor between model X and ΛCDM (third from last
column of Table 2). We find that the upper limit on the
Bayes factor B_XΛ (last column of Table 2) is about 3 or
less for all the data combinations, just around the "weak
evidence" threshold (ln B = 1, see Table 1). From our
discussion in section 2.2, this means that the necessary
condition for doubt to grow, p(X) B_XΛ ≫ 1, is not met for
any reasonable choice of doubt prior. We remind the reader
at this point that our unknown model X has been designed in
such a way as to exhibit the maximum possible evidence
against ΛCDM. Therefore, if even such a model cannot
achieve a significant level of evidence against ΛCDM, one
can safely conclude that no other reasonable model will. Of
course, this conclusion depends both on the set of
observations we have considered and on the particular
likelihood function we have ascribed to those data. New
statistical treatments can bring to light anomalies in the
existing data, while new observations might contain new
unexpected features.
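
As an illustration of how the columns of Table 2 relate to
one another, the sketch below converts the best-fit
improvement Δχ² into an upper bound on the Bayes factor and
checks the condition for doubt to grow. It assumes that the
bound of Eq. (10) takes the form B_XΛ ≤ exp(-Δχ²/2), which
is the relation the tabulated numbers appear to follow; the
function name is ours, not the paper's.

```python
import math


def bayes_factor_upper_bound(delta_chi2):
    """Upper bound on B_XLambda, ASSUMING the form B <= exp(-delta_chi2 / 2)."""
    return math.exp(-0.5 * delta_chi2)


# Delta chi^2 = chi^2_X - chi^2_Lambda, copied from Table 2.
delta_chi2 = {
    "CMB only": -1.21,
    "CMB+SN": -2.34,
    "CMB+mpk": -0.88,
    "CMB+SN+mpk": -2.15,
}

prior_doubt = 1e-2  # p(X); the larger of the two prior choices in the text

for data, dchi2 in delta_chi2.items():
    bound = bayes_factor_upper_bound(dchi2)
    # Necessary condition for doubt to grow: p(X) * B_XLambda >> 1.
    print(f"{data:>11s}: ln B <= {math.log(bound):.2f}, B <= {bound:.2f}, "
          f"p(X) * B <= {prior_doubt * bound:.3f}")
```

For every data combination the product p(X) B_XΛ stays far
below unity, which is the point made in the text.
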

Our results in terms of posterior probability for doubt and
for the ΛCDM model are shown in Table 3 for two different
assumptions regarding the level of prior doubt,
p(X) = 10⁻² and p(X) = 10⁻⁶. These two choices are
representative of a range that we think might bracket
reasonable prior expectations: a prior doubt of 1% is
certainly not too large, while leaving a little space for
updating our model beliefs in the light of data. A prior
doubt of 10⁻⁶ reflects the fact that surely we have to
allow for a one-in-a-million chance that our current list
of known models might be incomplete, and that the true
underlying dark energy model might still be undiscovered.

Table 3 contains the level of doubt, which is updated from
the prior by using the results of Table 2 for the models'
evidences. We find an increase in doubt by a factor of ~6
for the most constraining data combination (CMB+SN+mpk).
This, however, is largely a consequence of the doubt model
acquiring some of the probability mass of the known models
other than ΛCDM, as discussed under Case 3 in section 2.3.
Indeed, the posterior probability of ΛCDM is observed to
increase (last column of Table 3) from the initial prior
value p(Λ) ≈ 0.25 to just over 50% for the most
constraining data combination. This result is almost
independent of the choice of prior doubt. The behaviour of
the posterior probability for doubt and ΛCDM for a prior
choice p(X) = 10⁻² is shown in Fig. 1 as a function of the
data sets employed.

Figure 1. Posterior probability for doubt and for the ΛCDM
model as a function of different combinations of data sets.
The probability of ΛCDM increases from the initial 25% to
just over 50%, while the probability of doubt increases
from the initial 1% to just over 6%, mostly as a
consequence of acquiring probability from the other 3 known
models considered in the analysis. This signals that ΛCDM
remains the most valid statistical description of the data.
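
The update from prior to posterior doubt can be reproduced
from the last two columns of Table 2. The sketch below is
our reading of the doubt formalism, not a transcription of
the paper's equations: it assumes that the prior mass
1 - p(X) is shared equally among the N = 4 known models and
that ⟨B_iΛ⟩ is the average Bayes factor over those models
(ΛCDM included). Function names are illustrative only.

```python
def posterior_doubt(prior_doubt, b_x, mean_b_known):
    """p(X|d), with all Bayes factors taken relative to LCDM.

    ASSUMES each known model carries prior (1 - p(X)) / N, so that the
    known models contribute (1 - p(X)) * <B_iLambda> to the normalisation.
    """
    doubt_term = prior_doubt * b_x
    known_term = (1.0 - prior_doubt) * mean_b_known
    return doubt_term / (doubt_term + known_term)


def posterior_lcdm(prior_doubt, b_x, mean_b_known, n_known=4):
    """p(Lambda|d) under the same assumptions (B_LambdaLambda = 1)."""
    lcdm_term = (1.0 - prior_doubt) / n_known
    norm = prior_doubt * b_x + (1.0 - prior_doubt) * mean_b_known
    return lcdm_term / norm


# (<B_iLambda>, upper bound on B_XLambda), copied from Table 2.
table2 = {
    "CMB only": (0.72, 1.83),
    "CMB+SN": (0.54, 3.22),
    "CMB+mpk": (0.44, 1.55),
    "CMB+SN+mpk": (0.44, 2.93),
}

for data, (mean_b, b_x) in table2.items():
    d2 = posterior_doubt(1e-2, b_x, mean_b)
    d6 = posterior_doubt(1e-6, b_x, mean_b)
    p_l = posterior_lcdm(1e-2, b_x, mean_b)
    print(f"{data:>11s}: D(1e-2) = {d2:.2e}, D(1e-6) = {d6:.2e}, "
          f"p(LCDM|d) = {p_l:.2f}")
```

With the rounded inputs of Table 2 this gives values close
to those of Table 3, for example a posterior doubt of about
6% and a ΛCDM posterior of about 0.53 for CMB+SN+mpk.
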

                Doubt D                                                Posterior for ΛCDM, p(Λ|d)
Data sets       Prior doubt: p(X) = 10⁻²   Prior doubt: p(X) = 10⁻⁶    (with p(X) = 10⁻² and p(Λ) ≈ 0.25)
CMB only        (2.50 ± 0.2) × 10⁻²        (2.54 ± 0.2) × 10⁻⁶         0.34 ± 0.01
CMB+SN          (5.69 ± 0.5) × 10⁻²        (5.97 ± 0.6) × 10⁻⁶         0.44 ± 0.01
CMB+mpk         (3.46 ± 0.3) × 10⁻²        (3.54 ± 0.4) × 10⁻⁶         0.55 ± 0.02
CMB+SN+mpk      (6.29 ± 0.8) × 10⁻²        (6.64 ± 0.9) × 10⁻⁶         0.53 ± 0.02

Table 3. First two columns: posterior doubt for different
combinations of data sets and two prior doubt assumptions.
Last column: posterior probability for the ΛCDM model when
allowing for the possibility of a 1% prior doubt on the
completeness of our list of known models.



             Required Δχ² for p(X|d) = p(Λ|d)

p(X) = 10⁻²                     p(X) = p(Λ)/f
Known models N    Δχ²           f         Δχ²
      4          -6.4           4        -5.5
     10          -4.6          10        -9.2
     20          -3.2          10²      -18.4
     50          -1.4          10³      -27.6

Table 4. Improvement in the χ² of ΛCDM required for the
unknown model X to have the same a posteriori probability
as ΛCDM. First two columns: as a function of the number of
known models, N, assuming a fixed prior doubt p(X) = 10⁻².
Last two columns: assuming a fixed fractional prior doubt,
p(X) = p(Λ)/f, and as a function of f. It is assumed that
the evidence of the known models is much smaller than the
evidence for ΛCDM.



4.2 Impact of the addition of further known models

We now proceed to estimate the robustness of our findings
with respect to expanding the set of known models. As has
been mentioned above, the list of three alternative known
models to ΛCDM we adopted in this work is far from
complete. However, even if a larger number of models N were
included in the known models list, it is reasonable to
assume that the average evidence ratio ⟨B_iΛ⟩ between the
known models and ΛCDM would scale approximately as 1/N, for
there is no other known model that presently can achieve a
substantially higher evidence than ΛCDM (if this were the
case, then this other best model would take the place of
ΛCDM and become our baseline model which we seek to doubt,
or rather the dominant model in our list of models where we
intend to compute the doubt for the whole list). By
equating the posterior doubt with the posterior probability
of ΛCDM, we can solve for the value of Δχ² required for the
two to be equal. This gives the approximate condition
(assuming that ⟨B_iΛ⟩ ~ 1/N and that p ≡ p(X) ≪ 1):

$$ \Delta\chi^2 \approx 2 \ln\left(p N\right). \tag{21} $$

So the value of Δχ² required for posterior doubt to reach
the posterior for ΛCDM scales logarithmically with the
number of known models. Assuming a prior doubt
p(X) = 10⁻², one obtains the values of Δχ² listed in the
first two columns of Table 4 as a function of N. As more
known models are put on the table, it becomes easier to
doubt ΛCDM. From this scaling, it would appear that the
improvement of Δχ² = -2.3 for model X reported in Table 2
for the data combination CMB+SN would lead to a larger
probability of doubt than for ΛCDM if we had assumed a list
of N > 30 known models, rather than just three. As
illustrated in Fig. 2, this effect is however a consequence
of our choice of spreading the prior probability among the
N known models while assuming a fixed p(X), see Eq. (3). As
N increases, the prior for ΛCDM decreases while the prior
doubt is kept constant. As a consequence, it becomes easier
for doubt to "catch up" with ΛCDM.

Figure 2. Posterior for doubt (dashed lines) and for ΛCDM
(solid lines) as a function of -Δχ² = χ²_Λ - χ²_X, assuming
a fixed prior doubt p(X) = 10⁻². Different curves are for
different numbers of known models, N = 1, 4, 10, 20 (from
thin to thick), assuming that ⟨B_iΛ⟩ ~ 1/N.
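
The logarithmic scaling of Eq. (21) is easy to check
numerically. The sketch below simply evaluates
Δχ² = 2 ln(p N) for a fixed prior doubt and inverts it to
find the number of known models at which a given Δχ² would
suffice; the function names are ours.

```python
import math


def required_delta_chi2(n_known, prior_doubt=1e-2):
    """Delta chi^2 at which p(X|d) = p(Lambda|d), following Eq. (21)."""
    return 2.0 * math.log(prior_doubt * n_known)


def n_known_for(delta_chi2, prior_doubt=1e-2):
    """Invert Eq. (21): number of known models at which delta_chi2 suffices."""
    return math.exp(0.5 * delta_chi2) / prior_doubt


for n in (4, 10, 20, 50):
    print(f"N = {n:2d}: required Delta chi^2 = {required_delta_chi2(n):.1f}")

# Delta chi^2 = -2.3 is the CMB+SN improvement quoted in the text.
print(f"N needed for Delta chi^2 = -2.3: {n_known_for(-2.3):.0f}")
```

For p(X) = 10⁻² this reproduces the first two columns of
Table 4 and places the crossover for Δχ² = -2.3 at a little
over thirty known models.
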

In order to avoid this spurious effect, one could choose to
set the prior doubt as a fraction 1/f (f > 1) of the prior
probability for ΛCDM, i.e., to require that the relative
probability between X and Λ is constant a priori,
independent of the number of known models. We thus replace
the prescription of Eq. (3) by

$$ p(\Lambda) = \frac{1}{N}\left(1 - p(X)\right) \tag{22} $$

$$ p(X) = \frac{p(\Lambda)}{f} = (N f + 1)^{-1} \tag{23} $$

and by equating the posterior doubt with the posterior for
ΛCDM we obtain the following requirement for the χ²
improvement:

$$ \Delta\chi^2 = -4 \ln f. \tag{24} $$

This is now independent of the number of known models N and
it only depends logarithmically on the prior doubt
fraction, f. From the last two columns of Table 4 we can
see that even if doubt started off a factor of just f = 4
less probable than ΛCDM, a Δχ² = -5.5 would be required in
order for the unknown model X to become as probable as
ΛCDM. Increasing the prior gap between doubt and ΛCDM
(i.e., increasing f) only makes the requirements on the χ²
improvement more taxing.
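
The fractional prior prescription can be illustrated as
follows. The sketch evaluates the priors of Eqs. (22) and
(23) and the requirement of Eq. (24) exactly as stated in
the text; it does not rederive them, and the function names
are hypothetical.

```python
import math


def fractional_priors(n_known, f):
    """Priors of Eqs. (22)-(23): p(X) = p(Lambda) / f, equal share per known model."""
    p_x = 1.0 / (n_known * f + 1.0)
    p_lambda = (1.0 - p_x) / n_known
    return p_lambda, p_x


def required_delta_chi2_fractional(f):
    """Required improvement as stated in Eq. (24); independent of N."""
    return -4.0 * math.log(f)


for n_known in (4, 50):
    p_lambda, p_x = fractional_priors(n_known, f=4)
    # The prior ratio p(Lambda) / p(X) stays equal to f whatever N is.
    print(f"N = {n_known:2d}: p(Lambda) = {p_lambda:.4f}, p(X) = {p_x:.4f}, "
          f"ratio = {p_lambda / p_x:.1f}")

for f in (4, 10, 100, 1000):
    print(f"f = {f:4d}: required Delta chi^2 = "
          f"{required_delta_chi2_fractional(f):.1f}")
```

The second loop reproduces the last two columns of Table 4,
while the first shows that the prior odds between ΛCDM and
doubt no longer change as known models are added.
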

In summary, once the effect of adding extra models to the
known models' list is corrected for by introducing the
fractional prior doubt f, we find that the improvement in
the χ² found for various combinations of data sets is
insufficient to doubt ΛCDM. If the unknown model starts off
being a factor of 4 less probable than ΛCDM, one would need
an improvement in the χ² of about 5 units to reverse the
situation in the posterior, which is more than twice the
maximum χ² improvement observed from the data.



5 CONCLUSIONS 

The aim of this paper was to extend the application of 
Bayesian model selection to define an absolute scale of 
goodness of fit for models, rather than just a relative 
one, such as the Jeffreys' scale. We showed how the no- 
tion of doubt can be used to evaluate the evidence in 
favour of a missing 'ideal' unknown model in the list 
of known cosmological models. We demonstrated how a 
useful absolute upper bound to the Bayesian evidence of 
an unknown model can be derived and how this can be 
implemented in the context of Bayesian model compari- 
son. 

Doubt can be incorporated in the framework of model
comparison to help us decide whether our currently "best"
model is statistically adequate for the data at hand. Kunz
et al. (2006) introduced the notion of Bayesian complexity
to decide whether the available models are over-complex
with respect to the constraining power of the data.
Bayesian doubt can act as a useful complement to Bayesian
complexity, giving an indication of whether the current
models are statistically insufficient to describe the data.
Used in conjunction, doubt and complexity can thus extend
the power and domain of applicability of Bayesian model
comparison. Of course, statistical considerations should
never replace proper physical insight: all of our arguments
are restricted to the statistical aspects of data modeling.
But for the problem of dark energy, where most "models" are
of a phenomenological kind, it seems to us that a rigorous
statistical framework can help decide whether new
theoretical explorations might be fruitful. Other domains
where we expect doubt to be useful include the description
of the spectral distribution of CMB anisotropies and the
problem of anomalous alignments between multipoles in the
CMB (Tegmark et al. 2003; Schwarz et al. 2004; Land &
Magueijo 2005).

We have applied this methodology to the problem of dark
energy, adopting a list of known models including possible
extensions of the dark energy sector and non-zero curvature
of the Universe. In principle, many more models could be
added to the list of known models. However, we argued that
our results are robust against adding further models to
that list. We found that current CMB, matter power spectrum
and SNIa data do not require the introduction of an
alternative model to the baseline flat ΛCDM model. The
upper bound of the Bayesian evidence for a presently
unknown dark energy model against ΛCDM gives only weak
evidence in favour of the unknown model. Since this is an
absolute upper bound, we conclude that ΛCDM remains a
sufficient phenomenological description of currently
available observations.